
Optimize arrays_zip for perfectly aligned arrays #22245

Open
BipashaBi wants to merge 2 commits into apache:main from BipashaBi:arrays-zip-fast-path

BipashaBi commented May 16, 2026

Which issue does this PR close?

Rationale for this change

arrays_zip currently reconstructs output arrays row-by-row using MutableArrayData even when all input arrays are perfectly aligned and require no null padding.

In the perfectly aligned case, the output can reuse the existing child arrays and offsets directly, avoiding unnecessary copying and allocations.
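To make the "perfectly aligned" condition concrete, here is a minimal sketch that uses plain slices in place of Arrow offset buffers. It assumes that row `i` of a list column spans `offsets[i]..offsets[i + 1]` of its flat child values, and it treats inputs as aligned only when their offset buffers are identical. The name `offsets_are_aligned` is hypothetical, and the sketch ignores list-level nulls and sliced arrays, which the PR's `is_perfect_zip()` may handle differently.

```rust
/// Hypothetical helper (not the PR's actual code): returns true when every
/// input list column has exactly the same offsets buffer.
///
/// Identical offsets imply that each row has the same length in every input,
/// so zipping needs no null padding and one offsets buffer can serve them all.
fn offsets_are_aligned(all_offsets: &[&[i32]]) -> bool {
    match all_offsets.split_first() {
        // No inputs at all: nothing to reuse, so report "not aligned".
        None => false,
        // Compare every remaining offsets buffer against the first one.
        Some((first, rest)) => rest.iter().all(|other| other == first),
    }
}
```

Requiring identical offsets is a conservative choice for this sketch: two columns could have equal row lengths but different starting offsets, and such inputs would simply fall through to the slower path.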

What changes are included in this PR?

  • Added a fast-path optimization for perfectly aligned arrays
  • Added is_perfect_zip() validation helper
  • Added try_fast_path() zero-copy execution path
  • Reused child value arrays directly when possible
  • Reused list offsets instead of rebuilding them
  • Preserved the existing implementation as a fallback for ragged/null-padded inputs
  • Added a dedicated fast-path test
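The shape of the changes listed above can be illustrated with a toy model built only on the standard library. This is a sketch under stated assumptions, not the PR's implementation: `ListColumn`, `Zipped`, `zip_with_padding`, and this version of `try_fast_path` are hypothetical stand-ins for the Arrow types and the `MutableArrayData`-based fallback, and `Arc` sharing stands in for Arrow's reference-counted buffers.

```rust
use std::sync::Arc;

/// Toy list column: row `i` spans `offsets[i]..offsets[i + 1]` of `values`.
#[derive(Clone, Debug)]
struct ListColumn {
    offsets: Arc<Vec<i32>>,
    values: Arc<Vec<Option<i64>>>,
}

/// Toy zipped result: one shared offsets buffer and one child per input.
#[derive(Debug)]
struct Zipped {
    offsets: Arc<Vec<i32>>,
    children: Vec<Arc<Vec<Option<i64>>>>,
}

/// Length of `row` in `col`, derived from adjacent offsets.
fn row_len(col: &ListColumn, row: usize) -> usize {
    (col.offsets[row + 1] - col.offsets[row]) as usize
}

/// Fast path (hypothetical): if all inputs share identical offsets, reuse the
/// existing buffers by bumping reference counts instead of copying values.
fn try_fast_path(inputs: &[ListColumn]) -> Option<Zipped> {
    let first = inputs.first()?;
    let aligned = inputs
        .iter()
        .all(|c| c.offsets == first.offsets && c.values.len() == first.values.len());
    if !aligned {
        return None;
    }
    Some(Zipped {
        offsets: Arc::clone(&first.offsets),
        children: inputs.iter().map(|c| Arc::clone(&c.values)).collect(),
    })
}

/// Fallback (hypothetical): rebuild every child row by row, padding shorter
/// rows with `None` up to the longest row among the inputs.
fn zip_with_padding(inputs: &[ListColumn]) -> Zipped {
    let rows = inputs
        .first()
        .map_or(0, |c| c.offsets.len().saturating_sub(1));
    let mut offsets = vec![0i32];
    let mut children: Vec<Vec<Option<i64>>> = vec![Vec::new(); inputs.len()];
    for row in 0..rows {
        let max_len = inputs.iter().map(|c| row_len(c, row)).max().unwrap_or(0);
        for (child, col) in children.iter_mut().zip(inputs) {
            let start = col.offsets[row] as usize;
            let len = row_len(col, row);
            for i in 0..max_len {
                child.push(if i < len { col.values[start + i] } else { None });
            }
        }
        let next = offsets[row] + max_len as i32;
        offsets.push(next);
    }
    Zipped {
        offsets: Arc::new(offsets),
        children: children.into_iter().map(Arc::new).collect(),
    }
}

/// Entry point: take the fast path when possible, otherwise fall back.
fn arrays_zip(inputs: &[ListColumn]) -> Zipped {
    try_fast_path(inputs).unwrap_or_else(|| zip_with_padding(inputs))
}
```

In this model the fast path does no per-element work beyond the alignment check, while the fallback allocates and fills a new buffer for every child. Returning `Option` from `try_fast_path` keeps the existing row-by-row logic untouched as the default when the check fails.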

Are these changes tested?

  • Added a dedicated fast-path test for perfectly aligned arrays
  • Ran cargo check
  • Ran cargo test --package datafusion-functions-nested
  • Verified existing tests continue to pass

Are there any user-facing changes?

No user-facing changes. This is an internal performance optimization for arrays_zip.

github-actions bot added the "functions" label (Changes to functions implementation) May 16, 2026